I've been mucking around with Perl since I overloaded my DOS system with it some years ago. That was frustrating, so I wrote a C interpreter with built-in regexes, automatic line splitting, list manipulation and a dumb terminal (in 56 KB :-). Aside from playing about in Linux I didn't use Perl much.
With Perl 5, the ActiveState release on Windows platforms and its utility on the Internet, Perl has become much more usable and useful. I cheerfully abandoned my interpreter; Perl is better now :-).
Managing ActiveState HTML documentation
Any Perl hacker on Win32 grabs every ppm module that looks halfway interesting, and the docs for it end up in the Table of Contents, which gets very, very long.
With a little help from some collapsible-menu JavaScript by Matt Kruse and a bit of work with HTML::Parser we can create something a bit more manageable.
It uses cookies, so it will keep the sections you are interested in open. gentoc.zip (about 7k)
Indexing your hard disk
This has been showing me lots of HTML pages that I've carefully filed away as useful but then forgotten.
Give indexdb.pl a path and a db name and let it run. Then look.pl takes a keyword and produces a framed view with the hits list on the left and the first page open on the right.
look.pl uses Win32::OLE to call Internet Explorer (but you don't need to do that if you don't want to), and the style sheet is courtesy (they haven't complained) of ActiveState, so it looks comfortably familiar.
indexdb.pl can take a stop list argument: a regex for whatever is on the path that you don't want to index (sometimes very important!).
If you need something especially complex for a regex here it will be better to hack the program than to try to get it off the command line unmangled. If you want more than one path, modify indexdb.pl as needed too.
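As a rough illustration of what a stop list does during indexing (this is a sketch, not the code from indexdb.pl; the name index_tree and the in-memory hash standing in for the database are assumptions), a directory walk can skip anything matching the stop regex before any words are recorded:

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper, not taken from indexdb.pl: walk $path, skip anything
# whose full name matches the stop regex, and map each lowercased word
# (3+ characters) to the HTML files that contain it.
sub index_tree {
    my ($path, $stop) = @_;
    my %index;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $file = $File::Find::name;
            if (defined $stop && $file =~ /$stop/i) {
                # Prune whole directories so nothing below them is visited.
                $File::Find::prune = 1 if -d $file;
                return;
            }
            return unless -f $file && $file =~ /\.html?$/i;
            open my $fh, '<', $file or return;
            my $text = do { local $/; <$fh> };
            close $fh;
            $text = '' unless defined $text;
            $text =~ s/<[^>]*>/ /g;    # crude tag stripping, fine for a sketch
            $index{lc $1}{$file} = 1 while $text =~ /(\w{3,})/g;
        },
    }, $path);
    return \%index;
}
```

Calling index_tree('C:/web', 'Archive|junk') would return a hash reference keyed by word; the files containing "perl" are then keys %{ $index->{perl} }. Writing the result to a db file is left out of the sketch.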
I use a junk.db just to poke around the undusted files from 6 months ago. Indexing is fast enough to make a one-time database useful, so playing with it is fun.
Featuritis: What about my text files? I want to be able to delete trash easily! I want a new query to go in the same window. Can we get neat dropdown menus and a folder button for files in the same directory? (yeah, maybe)
indexer.zip (about 4k)
Yes, it's yet another upload utility. This seemed to be a good starting place
for learning about the Internet and dealing with it in Perl. I have half a
dozen sites or so, and besides the HTML there are backups of critical local
data to consider. This has a couple of features I needed, like how to ignore
certain files and directories, which works this way:
$ignore='\.log|\.bak|\.psp|\.zip|Archive|junk';
Don't do anything with log files, backups, huge photoshop stuff, the Archive
directory or anything named junk. This regex is matched against the file list so I
keep the garbage out, or some of it anyway.
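A minimal sketch of that matching step, assuming the file list is a plain array of relative paths; filter_files is a hypothetical name, not a function from Ftpmirror.pm. The pattern is single-quoted here so that each \. stays an escaped dot instead of collapsing to a bare "any character" dot inside a double-quoted string:

```perl
use strict;
use warnings;

# Single quotes keep the backslashes, so \. matches a literal dot.
my $ignore = '\.log|\.bak|\.psp|\.zip|Archive|junk';

# Hypothetical helper: return only the paths that do not match the pattern.
sub filter_files {
    my ($pattern, @files) = @_;
    my $re = qr/$pattern/;
    return grep { !/$re/ } @files;
}

my @upload = filter_files($ignore, qw(
    index.html images/logo.gif access.log Archive/old.html junk2/x.html catalog.html
));
print "$_\n" for @upload;   # index.html, images/logo.gif, catalog.html
```

With an unescaped dot, catalog.html and images/logo.gif would be dropped as well, because ".log" would match "alog" and "/log".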
For the rest of it that tends to accumulate when you automate something like
backups there is a delete list:
$deletefirst=2; # 0-delete after upload, 1-delete before, 2-no delete
The delete file is 'delete.txt', and it gets renamed to 'delete.bak' after
every operation. That's hardwired in since I forgot to change the switch one
time and wiped out a megabyte's worth of newly uploaded zip files. It takes a
deliberate effort to get the delete function going each time. There's a switch,
$gimmeftplistonly=1;, which is used to get a complete file list, and
I pick what I don't want from that.
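The rename-after-use safeguard could be sketched like this. It is only an illustration of the idea; take_delete_list is a hypothetical name, and the one-name-per-line file layout is an assumption about delete.txt:

```perl
use strict;
use warnings;

# Hypothetical helper: read the names in $list (assumed one per line), then
# rename the list to $backup so the same deletions cannot run twice by accident.
sub take_delete_list {
    my ($list, $backup) = @_;
    return () unless -e $list;          # no list, nothing to delete
    open my $fh, '<', $list or die "Cannot read $list: $!";
    chomp(my @names = <$fh>);
    close $fh;
    rename $list, $backup or die "Cannot rename $list to $backup: $!";
    return grep { /\S/ } @names;        # drop blank lines
}
```

A caller would use my @doomed = take_delete_list('delete.txt', 'delete.bak'); and a second run finds no delete.txt, so it returns an empty list until a new one is written on purpose.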
The module assumes you have a local directory that updates remote sites you can
get to using FTP. Setting it up is easy: give it a directory name and the proper
params and away you go. It will create new directories as needed. I'm updating
several sites from the same local directory and use one small file per
destination. It looks like this:
use strict;
use Ftpmirror;
$gimmeftplistonly=0;
$localbase="C:/web";
$localname="BangkokWizard"; #this is your starting directory
$site="ftp.xoom.com";
$user="username";
$pass="password";
$ignore='\.log|\.bak|\.psp|test\.html|junk.*';
$logfile="bwwx.log";
$deletefirst=2; # 0-delete after upload, 1-delete before, 2 no delete
ftpmirror();
Make up a file like the above, drop Ftpmirror.pm in perl/lib and you're ready to
go. It uses Win32::Internet so you will need that. It also uses TeeOutput
for the log file but you can comment out that part easily. There are a couple
of variables you might want to play with if you need to debug something.
$gimmeLocallistonly=0;
$showFTPresponse=0;
$sitebase="public_html"; # might need this on some sites, handles it properly
Ftpmirror.pm decides whether to upload based on file size. I was originally
using date, but that can fail when you're doing a lot of fast changes, and size
rarely does. The date hooks are still in there if you want to do something about that.
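The size comparison can be pictured as below. This is a sketch under assumptions, not the module's code: files_to_upload is a hypothetical name, and both sides are assumed to be available as hashes mapping a path to its size in bytes (locally that could come from -s on each file):

```perl
use strict;
use warnings;

# Hypothetical helper: given local and remote { path => size } maps, return
# the paths that are missing remotely or whose sizes differ.
sub files_to_upload {
    my ($local, $remote) = @_;
    my @todo = sort grep {
        !exists $remote->{$_} || $remote->{$_} != $local->{$_}
    } keys %$local;
    return @todo;
}

my @todo = files_to_upload(
    { 'index.html' => 1200, 'news.html' => 845, 'new.gif' => 310 },
    { 'index.html' => 1200, 'news.html' => 790 },
);
print "$_\n" for @todo;   # new.gif, news.html
```

An edit that leaves the byte count unchanged slips past this test, which is the trade-off accepted when choosing size over date.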
I've been using it successfully for several months but do consider it Beta
till you've wrung it out for yourself. Comments are welcome but remember
this was my first Perl 5 project and is my first module so be nice. ;-)
Todo
It could be a bit faster and an obvious way to do that is to check local
file times against the log file and ignore directories with no changes. It could
also be more modular, but I don't feel like fiddling with that yet.
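One way that time check might look, purely as a sketch: dir_changed_since is a hypothetical name, and using the log file's modification time as the "last run" marker is an assumption about how the idea would be wired in.

```perl
use strict;
use warnings;

# Hypothetical helper: true if any plain file directly inside $dir was
# modified after $since (epoch seconds, e.g. the log file's mtime).
sub dir_changed_since {
    my ($dir, $since) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @files = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;
    for my $name (@files) {
        my $mtime = (stat "$dir/$name")[9];
        return 1 if defined $mtime && $mtime > $since;
    }
    return 0;
}
```

With my $last_run = (stat 'bwwx.log')[9] || 0; a directory could then be skipped whenever dir_changed_since($dir, $last_run) is false. Subdirectories would need their own check, since only the files directly in $dir are examined.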
Get it here: Ftpmirror.pm (8.1 kb).
tracker.cgi
I wanted to track who was hitting my pages, which is straightforward using the $ENV{'REMOTE_HOST'}
and similar variables available to CGI, but with the addition of a little JavaScript borrowed :-) from Extreme
you get more useful information, like a better referrer and the screen resolution if the browser supports JavaScript 1.2.
I've adapted it so a 'Who' variable in the query string determines the log file, so I can have several web sites running off the same script.
Tracker outputs an image, a 1x1 transparent GIF by default, but the 'pic' variable changes the image.
Here's the JavaScript:
and here is tracker.cgi
It's been working reliably for several months.
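A rough sketch of the logging idea described above, not the actual tracker.cgi. The helper names, the tab-separated log format and the stripping of non-word characters from 'Who' are all assumptions made for the illustration; the 'pic' option is left out.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# A commonly used 1x1 transparent GIF, stored base64-encoded.
my $PIXEL = decode_base64('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');

# Hypothetical helper: pull one parameter out of a raw query string.
sub query_param {
    my ($query, $name) = @_;
    for my $pair (split /[&;]/, defined $query ? $query : '') {
        my ($k, $v) = split /=/, $pair, 2;
        next unless defined $k && $k eq $name;
        $v = '' unless defined $v;
        $v =~ tr/+/ /;
        $v =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;   # undo URL escaping
        return $v;
    }
    return;
}

# Hypothetical helper: build a log file name from the 'Who' value, keeping
# only word characters so the value cannot point outside the log directory.
sub log_name_for {
    my ($who) = @_;
    $who = '' unless defined $who;
    $who =~ s/\W//g;
    $who = 'default' unless length $who;
    return "$who.log";
}

# Hypothetical helper: one tab-separated log line from the CGI environment.
sub log_line {
    my ($env) = @_;
    return join("\t", scalar(localtime),
        map { defined $env->{$_} ? $env->{$_} : '-' }
            qw(REMOTE_HOST REMOTE_ADDR HTTP_REFERER HTTP_USER_AGENT)) . "\n";
}

# Tie the pieces together: append to the log, then send the image to $out.
sub track {
    my ($env, $logdir, $out) = @_;
    my $file = "$logdir/" . log_name_for(query_param($env->{QUERY_STRING}, 'Who'));
    open my $log, '>>', $file or die "Cannot append to $file: $!";
    print {$log} log_line($env);
    close $log;
    binmode $out;
    print {$out} "Content-Type: image/gif\r\n\r\n", $PIXEL;
    return $file;
}
```

In a CGI script the call would be something like track(\%ENV, 'logs', \*STDOUT); where 'logs' is a placeholder directory. A page would then reference the script as an image, with ?Who=sitename in the URL picking the log file.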
The script to output tables from the data is ugly but minimally functional.
Another version is in the works and will replace this one shortly. Right now the problem is that it works nicely
using my local version of Apache but dies on the remote system. :-(((.
Comments, suggestions and requests go here: robert@bangkokwizard.com
Images and Icons
I use the Apache as an icon for my CGI files and this as an icon for PM files.
Yes, I know it's lopsided; the original is the Ring Nebula from the Hubble
Space Telescope and is much prettier... Look here or on the Hubble Deep Space Wallpaper page.
These were done by Matt Kruse and they came out better than my own camel, so
here, take them all: icons.zip (5 kb).
NewSnatch grabbing chunks of pages